AITopics | epipolar line

A Global Depth-Range-Free Multi-View Stereo Transformer Network with Pose Embedding

Neural Information Processing Systems | Nov-19-2025, 17:31:24 GMT

In this paper, we propose a novel multi-view stereo (MVS) framework that gets rid of the depth range prior. Unlike recent prior-free MVS methods that work in a pair-wise manner, our method simultaneously considers all the source images. Specifically, we introduce a Multi-view Disparity Attention (MDA) module to aggregate long-range context information within and across multi-view images.

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > China (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Era3D: High-Resolution Multiview Diffusion Using Efficient Row-wise Attention

Neural Information Processing Systems | Nov-19-2025, 07:35:28 GMT

In this paper, we introduce Era3D, a novel multiview diffusion method that generates high-resolution multiview images from a single-view image.

arxiv preprint arxiv, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Oklahoma > Beaver County (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > China > Hong Kong (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis

Neural Information Processing Systems | Nov-16-2025, 13:47:49 GMT

Alignment method to ensure consistent depth scales across different views.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > Oklahoma > Beaver County (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(8 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)
Research Report > Promising Solution (0.67)

Industry:

Information Technology (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

38e511a690709603d4cc3a1c52b4a9fd-Paper-Conference.pdf

Neural Information Processing Systems | Nov-13-2025, 23:53:41 GMT

artificial intelligence, machine learning, transformer, (13 more...)

Neural Information Processing Systems

Country: Asia > China > Guangdong Province > Shenzhen (0.04)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

45ed1a72597594c097152ef9cc187762-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 00:58:07 GMT

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > Oklahoma > Beaver County (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(8 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)
Research Report > Promising Solution (0.67)

Industry:

Information Technology (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Epipolar Attention Field Transformers for Bird's Eye View Semantic Segmentation

Witte, Christian, Behley, Jens, Stachniss, Cyrill, Raaijmakers, Marvin

arXiv.org Artificial Intelligence | Dec-2-2024

Spatial understanding of the semantics of the surroundings is a key capability needed by autonomous cars to enable safe driving decisions. Recently, purely vision-based solutions have gained increasing research interest. In particular, approaches extracting a bird's eye view (BEV) from multiple cameras have demonstrated great performance for spatial understanding. This paper addresses the dependency on learned positional encodings to correlate image and BEV feature map elements for transformer-based methods. We propose leveraging epipolar geometric constraints to model the relationship between cameras and the BEV by Epipolar Attention Fields. They are incorporated into the attention mechanism as a novel attribution term, serving as an alternative to learned positional encodings. Experiments show that our method EAFormer outperforms previous BEV approaches by 2% mIoU for map semantic segmentation and exhibits superior generalization capabilities compared to implicitly learning the camera configuration.

artificial intelligence, machine learning, segmentation, (19 more...)

arXiv.org Artificial Intelligence

2412.01595

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
Asia > Singapore (0.04)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (0.35)
Information Technology > Robotics & Automation (0.35)
Automobiles & Trucks (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Exploiting Motion Prior for Accurate Pose Estimation of Dashboard Cameras

Lu, Yipeng, Zhao, Yifan, Wang, Haiping, Ruan, Zhiwei, Liu, Yuan, Dong, Zhen, Yang, Bisheng

arXiv.org Artificial Intelligence | Sep-27-2024

Dashboard cameras (dashcams) record millions of driving videos daily, offering a valuable potential data source for various applications, including driving map production and updates. A necessary step in using this dashcam data is estimating camera poses. However, the low-quality images captured by dashcams, characterized by motion blur and dynamic objects, make it hard for existing image-matching methods to estimate camera poses accurately. In this study, we propose a precise pose estimation method for dashcam images that leverages the inherent camera motion prior. Typically, image sequences captured by dashcams exhibit pronounced motion priors, such as forward movement or lateral turns, which serve as essential cues for correspondence estimation. Building upon this observation, we devise a pose regression module that learns the camera motion prior and then integrates it into both correspondence estimation and pose estimation. Experiments show that, on a real dashcam dataset, our method is 22% better than the baseline for pose estimation in AUC5°, and it can estimate poses for 19% more images with lower reprojection error in Structure from Motion (SfM).

artificial intelligence, correspondence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2409.18673

Country:

Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.70)

Industry: Media (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision > Video Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
